Equivalence of Games with Probabilistic Uncertainty and 
Partial-observation Games 



Krishnendu Chatterjee 1 , Martin Chmelik 1 , and Rupak Majumdar 2 

1 IST Austria (Institute of Science and Technology Austria) 
2 MPI-SWS, Germany 



Abstract. We introduce games with probabilistic uncertainty, a natural model for controller synthesis in which 
the controller observes the state of the system through imprecise sensors that provide correct information about 
the current state with a fixed probability. That is, in each step, the sensors return an observed state, and given the 
observed state, there is a probability distribution (due to the estimation error) over the actual current state. The 
controller must base its decision on the observed state (rather than the actual current state, which it does not know). 
On the other hand, we assume that the environment can perfectly observe the current state. We show that our 
model can be reduced in polynomial time to standard partial-observation stochastic games, and vice-versa. As a 
consequence we establish the precise decidability frontier for the new class of games, and for most of the decidable 
problems establish optimal complexity results. 

1 Introduction 

In a control system, a controller interacts with its environment through sensors and actuators. The controller observes 
the state of the environment through a set of sensors, computes a control signal that depends on the history of observed 
sensor readings, and feeds the control signal to the environment through actuators. The state of the environment is 
then updated as a function of the control signal as well as a disturbance signal that models external inputs to the 
environment. In a reactive setting, the sense-compute-actuate cycle repeats forever, resulting in an infinite trace of 
environment states. The objective of the controller is to ensure that the trace belongs to a given specification of "good" 
traces. The controller synthesis problem asks, given the dynamical law that specifies how the environment state changes 
according to the controller inputs and external disturbances, and a specification of good traces, to synthesize a control 
law that ensures that the environment traces are good, no matter how external disturbances behave. 

Controller synthesis has been studied extensively for deterministic games with ω-regular specifications [5, 14, 13]. 
In this setting, the problem is modeled as a game on a graph. The vertices of the graph represent system states, and 
are divided into "controller states" and "disturbance states." At a controller state, the controller chooses an outgoing 
edge and moves to a neighboring vertex along this edge. At a disturbance state, the disturbance chooses an outgoing 
edge and moves along this edge. This continues ad infinitum, defining a sequence of states. If this sequence satisfies 
the specification, the controller wins; otherwise, the disturbance wins. The games are called perfect observation, since 
both players have exact knowledge of the current state and the history of the game. 

The study of perfect-observation deterministic games has been extended to systems with partial observation, in 
which the controller can only observe part of the environment's state [15, 7], and to stochastic dynamics [12, 8, 10, 11], 
in which the state updates happen according to a probabilistic law. 

The "standard model" of partial-observation stochastic games [7, 3, 2] is described as an extension to the above 
graph model, by fixing an equivalence relation on the vertices (the "observation function"), and stipulating that the 
controller only sees the equivalence class of the current vertex, not the particular vertex the state is in. In addition, the 
transitions of the graph are stochastic: the controller and the disturbance each choose some move, and the next vertex 
is chosen according to a probability distribution based on the current vertex and the chosen move. 

In this paper, we introduce a different, albeit natural, model of probabilistic uncertainty in controller synthesis. 
Consider a state given by n bits. The sensors used to measure the state are typically not perfect, and observing the state 
through the sensor results in some bits being flipped with some known probability (probabilistic noise). In applications 
where the controller observes the state bits through a network, the probabilistic noise in the communication 
channels likewise results in bits being flipped with some known probability (according to Shannon's classical communication 
channel model). Thus, the controller observes n bits through the sensor, and this estimate defines a probability 
distribution over the state space for the current state. In contrast, we allow the disturbance to precisely observe the 
state, corresponding to a worst case assumption on the disturbance. The objective of the controller is to find a strategy 
that ensures that the system satisfies the specification under this probabilistic uncertainty on the current state. We dis- 
tinguish between two models of the disturbance. In the first model, the disturbance observes the correct sequence of 
states as well as both the observation of the controller and the sequence of controller moves. In the second model, the 
disturbance observes the correct sequence of states as well as the sequence of controller moves (but not the observation 
of the controller). It turns out that the two models give rise to subtle differences in defining the probability measures 
on the games, as well as different complexities in the solution algorithms. 

Our model (which we refer to as games with probabilistic uncertainty) is inspired by analogous models of state 
estimation under probabilistic noise in continuous control systems. We believe this model of games with probabilistic 
uncertainty naturally captures the behavior of many sensor-based control systems. Intuitively, the standard model 
of partial-observation games represents "partial but correct information" where the controller can observe correctly 
only the first k < n bits of the state (i.e., the observation is partial as the controller observes only a part of the 
state bits, but the information about the observed state bits is always correct). In contrast, our model of games with 
probabilistic uncertainty represents "complete but uncertain information" where the controller can observe all the n bits 
of the state but with uncertainty of observation (i.e., the controller can observe all the bits, but each bit is correct with 
some probability). Since the type of uncertain information in our model is very different from the standard models of 
partial-observation games studied in the literature, the relationship between them is not immediate. 

Our main contribution, along with the introduction of the natural model of games with probabilistic uncertainty, is 
establishing the equivalence of the new class of games and partial-observation games. Our main technical result is a 
polynomial-time reduction from this new model of games with probabilistic uncertainty to standard partial-observation 
games, and a converse reduction from partially-observable Markov decision processes (POMDPs) to games with 
probabilistic uncertainty. Establishing the equivalence of the two classes of games, which represent two 
different notions of information (partial but correct vs. complete but uncertain), is quite intricate. For example, for the 
new class of games the inductive definition of the probability measure is subtle and differs from the classical definition 
of the probability measure for probabilistic systems [17, 9]. This is because the controller observes a history that can be 
completely different from the actual history, whereas the environment (or disturbance) observes the actual history. 
We first inductively define a probability measure of observed history, given the actual history, and use it to define the 
probability measure inductively. We show how our polynomial constructions for reduction capture the subtleties in the 
probability measure, and by establishing precise mapping of strategies (which is the heart of the proof of correctness 
of reduction) we obtain the desired equivalence result. 

In the positive direction, our reduction allows us to solve controller synthesis problems for games with probabilistic 
uncertainty against ω-regular specifications, using the algorithms of [7, 2]. In the negative direction, we get lower 
bounds on the hardness of problems from the known hardness results for POMDPs of [1, 6]. 
In particular, with our reductions we establish precisely the decidability frontier of games with probabilistic uncertainty 
for various classes of parity objectives (a canonical form to express ω-regular specifications); and for most of 
the decidable problems we establish EXPTIME-complete bounds, and in some cases 2EXPTIME upper bounds and 
EXPTIME lower bounds (see Table 1). Moreover, our reduction allows the rich body of algorithms (such as symbolic 
and anti-chain based algorithms [7, 2]) for partial-observation games, along with any future algorithmic developments 
for partial-observation games, to be applied to solve games with probabilistic uncertainty. In summary, our results 
provide a precise decidability frontier, optimal complexity (in most cases), and algorithmic solutions for games with 
probabilistic uncertainty, which are a natural model for control problems with state estimation under probabilistic noise. 

2 Games with Probabilistic Uncertainty 

In this section we introduce a class of games with probabilistic imperfect information, and call them games with 
probabilistic uncertainty. 

Probability distribution. A probability distribution on a finite set $A$ is a function $\kappa : A \to [0, 1]$ such that 
$\sum_{a \in A} \kappa(a) = 1$. We denote by $\mathcal{D}(A)$ the set of probability distributions on $A$. 

Game structures with probabilistic uncertainty. A game structure with probabilistic uncertainty consists of a tuple 
$\mathcal{G} = (L, \Sigma_I, \Sigma_O, \Delta, \mathsf{un})$, where (a) $L$ is a set of locations; (b) $\Sigma_I$ and $\Sigma_O$ are the input and output alphabets, 
respectively; (c) $\Delta : L \times \Sigma_I \times \Sigma_O \to \mathcal{D}(L)$ is a probabilistic transition function that, given a location, an input letter, and 
an output letter, gives the probability distribution over the next locations; and (d) $\mathsf{un} : L \to \mathcal{D}(L)$ is the probabilistic 
uncertainty function that, given the true current location, describes the probability distribution of the observed location. 
If $\mathsf{un}$ is the identity function (i.e., $\mathsf{un}(\ell)(\ell) = 1$ for every $\ell \in L$), we obtain perfect-observation games. 
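
As an illustration only, the tuple above can be written down as a small Python data structure. The class name `UncertainGame`, the helper `is_distribution`, and the one-bit example are hypothetical and not part of the paper; the sketch merely checks that the transition function and the uncertainty function map into probability distributions over the locations, and tests the identity condition for perfect observation.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Set, Tuple

Dist = Dict[str, float]  # finite distribution: outcome -> probability


def is_distribution(d: Dist, support: Set[str], tol: float = 1e-9) -> bool:
    """Check that d is a probability distribution over a subset of `support`."""
    return (set(d) <= support
            and all(0.0 <= p <= 1.0 for p in d.values())
            and abs(sum(d.values()) - 1.0) <= tol)


@dataclass
class UncertainGame:
    locations: Set[str]                       # L
    inputs: Set[str]                          # input alphabet
    outputs: Set[str]                         # output alphabet
    delta: Dict[Tuple[str, str, str], Dist]   # (location, input, output) -> D(L)
    un: Dict[str, Dist]                       # true location -> D(L) over observed locations

    def validate(self) -> bool:
        """delta and un must be total and map into distributions over L."""
        triples = set(product(self.locations, self.inputs, self.outputs))
        return (set(self.delta) == triples
                and set(self.un) == self.locations
                and all(is_distribution(d, self.locations) for d in self.delta.values())
                and all(is_distribution(d, self.locations) for d in self.un.values()))

    def is_perfect_observation(self) -> bool:
        """True when un is the identity: every location is observed correctly."""
        return all(self.un[l].get(l, 0.0) == 1.0 for l in self.locations)


# Hypothetical example: a one-bit state whose sensor reading is correct with probability 0.9.
noisy_bit = UncertainGame(
    locations={"0", "1"},
    inputs={"keep", "flip"},
    outputs={"ok"},
    delta={
        ("0", "keep", "ok"): {"0": 1.0}, ("0", "flip", "ok"): {"1": 1.0},
        ("1", "keep", "ok"): {"1": 1.0}, ("1", "flip", "ok"): {"0": 1.0},
    },
    un={"0": {"0": 0.9, "1": 0.1}, "1": {"0": 0.1, "1": 0.9}},
)
```

Replacing `un` by the identity distributions turns the same structure into a perfect-observation game.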

Intuitively, a game proceeds as follows. The game starts at some location $\ell \in L$. Player 1 observes a state drawn 
from the distribution $\mathsf{un}(\ell)$, which represents a potentially faulty observation process. Intuitively, at every step the 
player can observe the value of all variables that correspond to the state of the game, but there is a probability that 
the observed values of some variables are incorrect. Player 2 observes the "correct" state $\ell$. Given the observation of 
the history of the game so far, Player 1 picks an input letter $\sigma^I \in \Sigma_I$. Player 2 then picks an output letter $\sigma^O \in \Sigma_O$; 
we consider two variants: (1) Player 2 only observes the history of correct locations and the moves of the players; 
and (2) Player 2 observes the history of correct locations, the moves of the players, and also the history of 
observed locations of Player 1. The state of the game is updated to $\ell'$ with probability $\Delta(\ell, \sigma^I, \sigma^O)(\ell')$. This process is 
repeated ad infinitum. 
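
The round structure described above can be sketched as a short simulation. This is only an illustrative sketch for the first variant (Player 2 sees the correct locations and the moves, but not Player 1's observations); the function names `sample` and `play_round`, the dictionary encoding of the transition and uncertainty functions, and the representation of strategies as Python callables are assumptions made for the example.

```python
import random


def sample(dist, rng):
    """Draw one outcome from a finite distribution {outcome: probability}."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes], k=1)[0]


def play_round(delta, un, location, true_hist, obs_hist, player1, player2, rng):
    """Play one round from `location` and return the next location.

    true_hist and obs_hist are lists extended in place: Player 1 only reads
    obs_hist, while Player 2 (variant 1) only reads true_hist.
    """
    observed = sample(un[location], rng)       # possibly faulty sensor reading
    true_hist.append(location)
    obs_hist.append(observed)
    a_in = player1(tuple(obs_hist))            # Player 1 sees the observed history only
    a_out = player2(tuple(true_hist), a_in)    # Player 2 sees the correct history and a_in
    true_hist.extend([a_in, a_out])
    obs_hist.extend([a_in, a_out])             # the moves are visible in both histories
    return sample(delta[(location, a_in, a_out)], rng)
```

Calling `play_round` repeatedly, feeding the returned location back in, produces longer and longer prefixes of a play together with the (possibly different) prefix that Player 1 observed.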

Plays. A play of $\mathcal{G}$ is a sequence $\rho = \ell_0 \sigma^I_0 \sigma^O_0 \ell_1 \sigma^I_1 \sigma^O_1 \cdots$ of locations, input letters, and output letters, such that for all 
$j \geq 0$ we have $\Delta(\ell_j, \sigma^I_j, \sigma^O_j)(\ell_{j+1}) > 0$. The prefix up to $\ell_n$ of the play $\rho$ is denoted by $\rho(n)$, its length is $|\rho(n)| = n + 1$, 
and its last element is $\mathsf{Last}(\rho(n)) = \ell_n$. The set of plays in $\mathcal{G}$ is denoted by $\mathsf{Plays}(\mathcal{G})$, and the set of corresponding 
finite prefixes is denoted $\mathsf{Prefs}(\mathcal{G})$. 

Strategies. A strategy for Player 1 observes the finite prefix of a play and then selects an input letter (pure strategies) or a 
probability distribution over input letters in $\Sigma_I$ (randomized strategies). Formally, a pure strategy for Player 1 is a function $\alpha : \mathsf{Prefs}(\mathcal{G}) \to \Sigma_I$, 
and a randomized strategy for Player 1 is a function $\alpha : \mathsf{Prefs}(\mathcal{G}) \to \mathcal{D}(\Sigma_I)$. Similarly, pure and randomized strategies 
for Player 2 are defined as functions $\beta : \mathsf{Prefs}(\mathcal{G}) \times \Sigma_I \to \Sigma_O$ and $\beta : \mathsf{Prefs}(\mathcal{G}) \times \Sigma_I \to \mathcal{D}(\Sigma_O)$, respectively. Note 
that Player 2 sees Player 1's choice of input action at each step. In the case where Player 2 also observes the history of 
observed locations, the pure and randomized strategies are defined as functions $\beta : \mathsf{Prefs}(\mathcal{G}) \times \mathsf{Prefs}(\mathcal{G}) \times \Sigma_I \to \Sigma_O$ 
and $\beta : \mathsf{Prefs}(\mathcal{G}) \times \mathsf{Prefs}(\mathcal{G}) \times \Sigma_I \to \mathcal{D}(\Sigma_O)$, respectively, where the output letter is chosen based on the original 
history and the observed history. We refer to strategies that observe both histories as "all-powerful" strategies for Player 2. 

Outcomes. The outcome of two randomized strategies $\alpha$ for Player 1 and $\beta$ for Player 2 from a location $\ell \in L$ is 
the set of plays $\rho = \ell_0 \sigma^I_0 \sigma^O_0 \ell_1 \ldots$ such that (1) $\ell_0 = \ell$, (2) there exists a sequence $\ell'_0 \ell'_1 \ldots$ such that $\mathsf{un}(\ell_j)(\ell'_j) > 0$ 
for each $j \geq 0$, and (3) for each $j \geq 0$, we have $\alpha(\ell'_0 \sigma^I_0 \sigma^O_0 \ldots \ell'_j)(\sigma^I_j) > 0$ and $\beta(\rho(j), \sigma^I_j)(\sigma^O_j) > 0$ (if $\beta$ is an all-powerful 
strategy, then $\beta(\rho(j), \ell'_0 \sigma^I_0 \sigma^O_0 \ell'_1 \ldots \ell'_j, \sigma^I_j)(\sigma^O_j) > 0$), and $\Delta(\ell_j, \sigma^I_j, \sigma^O_j)(\ell_{j+1}) > 0$. The primed sequence 
$\ell'_0 \ell'_1 \ldots$ gives the sequence of observations made by Player 1 using the probabilistic uncertainty function. Note that 
this sequence may be incorrect with some probability due to probabilistic uncertainty in the observation. We denote 
this set of plays as $\mathsf{Outcome}(\mathcal{G}, \ell, \alpha, \beta)$. The outcome of two pure strategies is defined analogously, considering pure 
strategies as degenerate randomized strategies which pick a letter with probability one. The outcome set of the pure 
(resp. randomized) strategy $\alpha$ for Player 1 in $\mathcal{G}$ is the set $\mathsf{Outcome}_1(\mathcal{G}, \ell, \alpha)$ of plays $\rho$ such that there exists a pure 
(resp. randomized) strategy $\beta$ for Player 2 with $\rho \in \mathsf{Outcome}(\mathcal{G}, \ell, \alpha, \beta)$. The outcome set $\mathsf{Outcome}_2(\mathcal{G}, \ell, \beta)$ for 
Player 2 is defined symmetrically. 

Probability measure. Given strategies $\alpha$ and $\beta$, we define the probability measure $\Pr^{\alpha,\beta}_{\ell_0}(\cdot)$. The definition of the 
probability measure is subtle and non-standard, as the prefix that Player 1 observes can be completely different from 
the original history. For a finite prefix $\rho \in \mathsf{Prefs}(\mathcal{G})$, let $\mathsf{Cone}(\rho)$ denote the set of plays with $\rho$ as prefix. We will define 
$\Pr^{\alpha,\beta}_{\ell_0}(\cdot)$ for cones, and then by Carathéodory's extension theorem [4] there is a unique extension to all measurable sets 
of paths. To define the probability measure we also need to define a function $\mathsf{ObsSeq}(\rho)$ that, given a finite prefix $\rho$, 
gives the probability distribution over finite prefixes $\rho'$, such that $\mathsf{ObsSeq}(\rho)(\rho')$ denotes the probability of observing 
$\rho'$ given that the correct prefix is $\rho$. The base case is as follows: 

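$$\Pr^{\alpha,\beta}_{\ell_0}(\mathsf{Cone}(\ell_0)) = 1; \qquad \mathsf{ObsSeq}(\ell_0)(\ell'_0) = \mathsf{un}(\ell_0)(\ell'_0).$$
The inductive definition of $\mathsf{ObsSeq}$ is as follows: for a prefix $\rho$ of length $n + 1$, 

$$\mathsf{ObsSeq}(\rho\, \sigma^I_n \sigma^O_n \ell_{n+1})(\rho'\, \sigma^I_n \sigma^O_n \ell'_{n+1}) = \mathsf{ObsSeq}(\rho)(\rho') \cdot \mathsf{un}(\ell_{n+1})(\ell'_{n+1}).$$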
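
Unrolling the induction, the probability of an observed prefix is the product of the per-location uncertainty probabilities. A minimal sketch, assuming prefixes are encoded as Python lists `[l0, in0, out0, l1, ...]` (locations at positions 0, 3, 6, ...) and the uncertainty function as a nested dictionary; the name `obs_seq` is hypothetical. Prefixes whose input or output letters differ get probability zero, in line with the paper's remark on non action-matching prefixes.

```python
def obs_seq(un, true_prefix, obs_prefix):
    """Probability of observing obs_prefix when the correct prefix is true_prefix.

    Both prefixes have the shape [l0, in0, out0, l1, in1, out1, ..., ln].
    """
    if len(true_prefix) != len(obs_prefix):
        return 0.0
    prob = 1.0
    for k, (x, y) in enumerate(zip(true_prefix, obs_prefix)):
        if k % 3 == 0:
            prob *= un[x].get(y, 0.0)   # factor un(l_k)(l'_k) for a location
        elif x != y:
            return 0.0                  # letters must match exactly
    return prob
```

For a fixed correct prefix, summing `obs_seq` over all observed location sequences with the same letters gives 1, which is what makes it a probability distribution over observed prefixes.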

Given a sequence $\rho = \ell_0 \sigma^I_0 \sigma^O_0 \ell_1 \sigma^I_1 \sigma^O_1 \ldots \ell_n$, we define $\mathsf{ActMt}(\rho) = \{\rho' = \ell'_0 \hat{\sigma}^I_0 \hat{\sigma}^O_0 \ell'_1 \hat{\sigma}^I_1 \hat{\sigma}^O_1 \ldots \ell'_n \mid \forall\, 0 \leq j \leq n-1.\ \hat{\sigma}^I_j = \sigma^I_j \text{ and } \hat{\sigma}^O_j = \sigma^O_j\}$ 
as the set of sequences of the same length as $\rho$ whose input and output letters match those of $\rho$ 
(i.e., the set of action-matching prefixes). Note that for non action-matching prefixes the observation sequence function 
always assigns probability zero. The inductive case for the probability measure is as follows: for a prefix $\rho$ of length 
$n + 1$ with last state $\ell_n$, we have 



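$$\Pr^{\alpha,\beta}_{\ell_0}(\mathsf{Cone}(\rho\, \sigma^I_n \sigma^O_n \ell_{n+1})) = \Pr^{\alpha,\beta}_{\ell_0}(\mathsf{Cone}(\rho)) \cdot \Big( \sum_{\rho' \in \mathsf{ActMt}(\rho)} \mathsf{ObsSeq}(\rho)(\rho') \cdot \alpha(\rho')(\sigma^I_n) \Big) \cdot \beta(\rho\, \sigma^I_n)(\sigma^O_n) \cdot \Delta(\ell_n, \sigma^I_n, \sigma^O_n)(\ell_{n+1});$$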



i.e., $\mathsf{ObsSeq}(\rho)(\rho')$ gives the probability to observe $\rho'$, then $\alpha(\rho')(\sigma^I_n)$ denotes the probability to play $\sigma^I_n$ given the 
strategy and the observed sequence $\rho'$, and since Player 2 observes the correct sequence, the probability to play $\sigma^O_n$ is given 
by $\beta(\rho\, \sigma^I_n)(\sigma^O_n)$ (Player 2 observes $\rho$), and the final term $\Delta(\ell_n, \sigma^I_n, \sigma^O_n)(\ell_{n+1})$ gives the transition probability. If $\beta$ is 
an all-powerful strategy, then $\beta$ observes both the correct history $\rho$ and the observed history $\rho'$, and then the definition 
is as follows: 

$$\Pr^{\alpha,\beta}_{\ell_0}(\mathsf{Cone}(\rho\, \sigma^I_n \sigma^O_n \ell_{n+1})) = \Pr^{\alpha,\beta}_{\ell_0}(\mathsf{Cone}(\rho)) \cdot \Big( \sum_{\rho' \in \mathsf{ActMt}(\rho)} \mathsf{ObsSeq}(\rho)(\rho') \cdot \alpha(\rho')(\sigma^I_n) \cdot \beta(\rho, \rho', \sigma^I_n)(\sigma^O_n) \Big) \cdot \Delta(\ell_n, \sigma^I_n, \sigma^O_n)(\ell_{n+1}).$$



Winning objectives. An objective for Player 1 in $\mathcal{G}$ is a set $\phi \subseteq \mathsf{Plays}(\mathcal{G})$ of plays. A play $\rho \in \mathsf{Plays}(\mathcal{G})$ satisfies the 
objective $\phi$, denoted $\rho \models \phi$, if $\rho \in \phi$. We consider ω-regular objectives specified as parity objectives (a canonical form 
to express all ω-regular objectives [16]). For a play $\rho = \ell_0 \sigma^I_0 \sigma^O_0 \ell_1 \ldots$, we denote by $\rho_k$ the $k$-th location $\ell_k$ of the play 
and denote by $\mathsf{Inf}(\rho)$ the set of locations that occur infinitely often in $\rho$, that is, $\mathsf{Inf}(\rho) = \{\ell \mid \forall i\, \exists j : j > i \text{ and } \ell_j = \ell\}$. 
We consider the following classes of objectives. 

1. Reachability and safety objectives. Given a set $T \subseteq L$ of target locations, the reachability objective $\mathsf{Reach}(T)$ 
requires that a location in $T$ be visited at least once, that is, $\mathsf{Reach}(T) = \{\rho \mid \exists k \geq 0 \cdot \rho_k \in T\}$. Dually, the 
safety objective $\mathsf{Safe}(T)$ requires that only states in $T$ be visited. Formally, $\mathsf{Safe}(T) = \{\rho \mid \forall k \geq 0 \cdot \rho_k \in T\}$. 

2. Büchi and coBüchi objectives. Let $T \subseteq L$ be a set of target locations. The Büchi objective $\mathsf{Buchi}(T)$ requires that 
a state in $T$ be visited infinitely often, that is, $\mathsf{Buchi}(T) = \{\rho \mid \mathsf{Inf}(\rho) \cap T \neq \emptyset\}$. Dually, the coBüchi objective 
$\mathsf{coBuchi}(T)$ requires that only states in $T$ be visited infinitely often. Formally, $\mathsf{coBuchi}(T) = \{\rho \mid \mathsf{Inf}(\rho) \subseteq T\}$. 

3. Parity objectives. For $d \in \mathbb{N}$, let $p : L \to \{0, 1, \ldots, d\}$ be a priority function, which maps each state to a nonnegative 
integer priority. The parity objective $\mathsf{Parity}(p)$ requires that the minimum priority that occurs infinitely often 
be even. Formally, $\mathsf{Parity}(p) = \{\rho \mid \min\{p(\ell) \mid \ell \in \mathsf{Inf}(\rho)\} \text{ is even}\}$. The Büchi and coBüchi objectives are the 
special cases of parity objectives with two priorities, $p : L \to \{0, 1\}$ and $p : L \to \{1, 2\}$, respectively. 
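
To make the objectives concrete, the sketch below evaluates them on ultimately periodic plays, i.e., plays consisting of a finite list of locations `prefix` followed by a finite list `cycle` repeated forever, so that the set of locations visited infinitely often is exactly the set of locations on the cycle. The restriction to such plays, the function names, and the encoding of the priority function as a dictionary are assumptions made for illustration; only the locations of the play matter here, so letters are omitted.

```python
def reach(prefix, cycle, target):
    """Reach(T): some visited location lies in T."""
    return any(l in target for l in prefix + cycle)


def safe(prefix, cycle, target):
    """Safe(T): every visited location lies in T."""
    return all(l in target for l in prefix + cycle)


def buchi(cycle, target):
    """Buchi(T): some location of T is visited infinitely often."""
    return bool(set(cycle) & target)


def cobuchi(cycle, target):
    """coBuchi(T): only locations of T are visited infinitely often."""
    return set(cycle) <= target


def parity(cycle, priority):
    """Parity(p): the minimum priority seen infinitely often is even."""
    return min(priority[l] for l in cycle) % 2 == 0
```

With priorities 0 on `T` and 1 elsewhere, `parity` agrees with `buchi`; with priorities 2 on `T` and 1 elsewhere, it agrees with `cobuchi`, matching the two-priority special cases stated in item 3.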

Sure, almost-sure and positive winning. An event is a measurable set of plays, and given strategies $\alpha$ and $\beta$ for the 
two players, the probabilities of events are uniquely defined. For an objective $\phi$, assumed to be Borel, we denote by 
$\Pr^{\alpha,\beta}_{\ell}(\phi)$ the probability that $\phi$ is satisfied by the play obtained from the starting location $\ell$ when the strategies $\alpha$ and $\beta$ 
are used. Given a game $\mathcal{G}$, an objective $\phi$, and a location $\ell$, we consider the following winning modes: (1) a strategy $\alpha$ 
for Player 1 is sure winning for the objective $\phi$ from $\ell \in L$ if $\mathsf{Outcome}(\mathcal{G}, \ell, \alpha, \beta) \subseteq \phi$ for all strategies $\beta$ for Player 2; 
(2) a strategy $\alpha$ for Player 1 is almost-sure winning for the objective $\phi$ from $\ell \in L$ if $\Pr^{\alpha,\beta}_{\ell}(\phi) = 1$ for all strategies $\beta$ 
for Player 2; and (3) a strategy $\alpha$ for Player 1 is positive winning for the objective $\phi$ from $\ell \in L$ if $\Pr^{\alpha,\beta}_{\ell}(\phi) > 0$ for 
all strategies $\beta$ for Player 2. 

Qualitative analysis of a game consists of the computation of the sure, almost-sure and positive winning sets. 
The sure (resp. almost-sure and positive) winning decision problem for an objective consists of a game and a starting 
location £, and asks whether there is a sure (resp. almost-sure and positive) winning strategy from £. 

3 Partial-observation Stochastic Games 

We now recall the usual definition of partial-observation games and their subclasses. We focus on partial-observation 
turn-based probabilistic games, where at each round one of the players is in charge of choosing the next action and the 
transition function is probabilistic. We will present a polynomial time reduction of games with probabilistic uncertainty 
to these games. 

Partial-observation games. A partial-observation stochastic game (for short partial-observation game or simply a 
game) is a tuple $G = (S_1 \cup S_2, A_1, A_2, \delta_1 \cup \delta_2, \mathcal{O}_1, \mathcal{O}_2)$ with the following components: 

1. (State space). $S = S_1 \cup S_2$ is a finite set of states, where $S_1 \cap S_2 = \emptyset$ (i.e., $S_1$ and $S_2$ are disjoint), states in $S_1$ 
are Player 1 states, and states in $S_2$ are Player 2 states. 

2. (Actions). $A_i$ ($i = 1, 2$) is a finite set of actions for Player $i$. 

3. (Transition function). For $i \in \{1, 2\}$, the probabilistic transition function for Player $i$ is the function $\delta_i : S_i \times A_i \to \mathcal{D}(S_{3-i})$ 
that maps a state $s_i \in S_i$ and an action $a_i \in A_i$ to the probability distribution $\delta_i(s_i, a_i)$ over the successor 
states in $S_{3-i}$ (i.e., games are alternating). 

4. (Observations). $\mathcal{O}_1 \subseteq 2^S$ is a finite set of observations for Player 1 that partitions the state space $S$, and similarly 
$\mathcal{O}_2$ is the set of observations for Player 2. These partitions uniquely define functions $\mathsf{obs}_i : S \to \mathcal{O}_i$, for $i \in \{1, 2\}$, that 
map each state to its observation such that $s \in \mathsf{obs}_i(s)$ for all $s \in S$. We will also consider the special case of 
one-sided games, where Player 2 is perfectly informed (has complete observation), i.e., $\mathcal{O}_2 = S$ and $\mathsf{obs}_2(s) = s$ 
for all $s \in S$ (i.e., the partition consists of singleton states). 

Special Class: POMDPs. We will consider one special class of partial-observation games called partially observable 
Markov decision processes (POMDPs), where the action set for Player 2 is a singleton (i.e., there is effectively only 
Player 1 and stochastic transitions). Hence we will omit the action set and observation for Player 2 and represent a 
POMDP as the following tuple $G = (S, A, \delta, \mathcal{O})$, where $\delta : S \times A \to \mathcal{D}(S)$. 

Plays. In a game, in each turn, for $i \in \{1, 2\}$, if the current state $s$ is in $S_i$, then Player $i$ chooses an action $a \in A_i$, 
and the successor state is chosen by sampling the probability distribution $\delta_i(s, a)$. A play in $G$ is an infinite sequence 
of states and actions $\rho = s_0 a_0 s_1 a_1 \ldots$ such that for all $j \geq 0$, if $s_j \in S_i$, for $i \in \{1, 2\}$, then there exists $a_j \in A_i$ 
such that $\delta_i(s_j, a_j)(s_{j+1}) > 0$. The definitions of prefix and length are analogous to the definitions in Section 2. 
For $i \in \{1, 2\}$, we denote by $\mathsf{Prefs}_i(G)$ the set of finite prefixes in $G$ that end in a state in $S_i$. The observation 
sequence of $\rho = s_0 a_0 s_1 a_1 \ldots$ for Player $i$ ($i = 1, 2$) is the unique infinite sequence of observations and actions, i.e., 
$\mathsf{obs}_i(\rho) = o_0 a_0 o_1 a_1 o_2 \ldots$ such that $s_j \in o_j$ for all $j \geq 0$. The observation sequence for finite sequences (prefixes of 
plays) is defined analogously. 

Strategies. A pure strategy in $G$ for Player 1 is a function $\alpha : \mathsf{Prefs}_1(G) \to A_1$. A randomized strategy in $G$ for 
Player 1 is a function $\alpha : \mathsf{Prefs}_1(G) \to \mathcal{D}(A_1)$. A (pure or randomized) strategy $\alpha$ for Player 1 is observation-based 
if for all prefixes $\rho, \rho' \in \mathsf{Prefs}_1(G)$, if $\mathsf{obs}_1(\rho) = \mathsf{obs}_1(\rho')$, then $\alpha(\rho) = \alpha(\rho')$. We omit analogous definitions of 
strategies for Player 2. We denote by $\mathcal{A}_G$, $\mathcal{A}^O_G$, $\mathcal{A}^P_G$, $\mathcal{B}_G$, $\mathcal{B}^O_G$, $\mathcal{B}^P_G$ the set of all Player-1 strategies in $G$, the set of all 
observation-based Player-1 strategies, the set of all pure Player-1 strategies, the set of all Player-2 strategies in $G$, the 
set of all observation-based Player-2 strategies, and the set of all pure Player-2 strategies, respectively. In the setting 
where Player 1 has partial observation and Player 2 has complete observation, the set $\mathcal{B}_G$ of all strategies coincides 
with the set $\mathcal{B}^O_G$ of all observation-based strategies. We will require the players to play observation-based strategies. 
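
One simple way to obtain an observation-based strategy is to define it on observation sequences and compose it with the observation map. The sketch below assumes prefixes are tuples `(s0, a0, s1, a1, ..., sn)` with states at even positions and that the observation function is given as a dictionary; the helper names are hypothetical. The checker only tests the defining condition on a finite collection of prefixes, so it can refute but never fully certify that a strategy is observation-based.

```python
def obs_sequence(prefix, obs):
    """Replace each state (even positions of s0 a0 s1 a1 ...) by its observation."""
    return tuple(obs[x] if k % 2 == 0 else x for k, x in enumerate(prefix))


def observation_based(policy, obs):
    """Lift a policy on observation sequences to a strategy on prefixes."""
    return lambda prefix: policy(obs_sequence(prefix, obs))


def is_observation_based(strategy, prefixes, obs):
    """Check that prefixes with equal observation sequences get equal choices."""
    seen = {}
    for p in prefixes:
        key = obs_sequence(p, obs)
        choice = strategy(p)
        if seen.setdefault(key, choice) != choice:
            return False
    return True
```

Any strategy built with `observation_based` satisfies the condition by construction, whereas a strategy that inspects the actual last state can fail it as soon as two states share an observation.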

Outcomes. The outcome of two randomized strategies $\alpha$ (for Player 1) and $\beta$ (for Player 2) from a state $s$ in $G$ is 
the set of plays $\rho = s_0 a_0 s_1 a_1 \ldots \in \mathsf{Plays}(G)$, with $s_0 = s$, where for all $j \geq 0$, if $s_j \in S_1$ (resp. $s_j \in S_2$), 
then $\alpha(\rho(j))(a_j) > 0$ (resp. $\beta(\rho(j))(a_j) > 0$) and $\delta_1(s_j, a_j)(s_{j+1}) > 0$ (resp. $\delta_2(s_j, a_j)(s_{j+1}) > 0$). This set is denoted 
$\mathsf{Outcome}(G, s, \alpha, \beta)$. The outcome of two pure strategies is defined analogously by viewing pure strategies as 
randomized strategies that play their chosen action with probability one. The outcome set of the pure (resp. randomized) 
strategy $\alpha$ for Player 1 in $G$ is the set $\mathsf{Outcome}_1(G, s, \alpha)$ of plays $\rho$ such that there exists a pure (resp. randomized) 
strategy $\beta$ for Player 2 with $\rho \in \mathsf{Outcome}(G, s, \alpha, \beta)$. The outcome set $\mathsf{Outcome}_2(G, s, \beta)$ for Player 2 is defined 
symmetrically. 

Probability measure. We define the probability measure $\Pr^{\alpha,\beta}_{s}(\cdot)$ as follows: for a finite prefix $\rho$, let $\mathsf{Cone}(\rho)$ denote 
the set of plays with $\rho$ as prefix. Then we have $\Pr^{\alpha,\beta}_{s}(\mathsf{Cone}(s)) = 1$, and for a prefix $\rho$ of length $n + 1$ ending in a Player 1 
state $s_n$ we have 

$$\Pr^{\alpha,\beta}_{s}(\mathsf{Cone}(\rho\, a_n s_{n+1})) = \Pr^{\alpha,\beta}_{s}(\mathsf{Cone}(\rho)) \cdot \alpha(\rho)(a_n) \cdot \delta_1(s_n, a_n)(s_{n+1});$$
and the definition when $s_n$ is a Player 2 state is similar. For a set $Q$ of finite prefixes, we write $\Pr^{\alpha,\beta}_{s}(\mathsf{Cone}(Q))$ for 
$\Pr^{\alpha,\beta}_{s}(\bigcup_{\rho \in Q} \mathsf{Cone}(\rho))$. 
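
The inductive definition of the cone probability amounts to multiplying, step by step, the probability that the owner of the current state plays the recorded action and the probability of the recorded successor. A minimal sketch, assuming prefixes are lists `[s0, a0, s1, ..., sn]`, the two transition functions are merged into one dictionary `delta[(state, action)]`, `owner[state]` is 1 or 2, and strategies are callables from histories to action distributions; these encodings and the name `cone_probability` are illustrative assumptions.

```python
def cone_probability(prefix, owner, delta, alpha, beta):
    """Probability of the cone of prefix = [s0, a0, s1, ..., sn], started in s0."""
    prob = 1.0                                   # the cone of the initial state has probability 1
    for k in range(0, len(prefix) - 1, 2):
        s, a, s_next = prefix[k], prefix[k + 1], prefix[k + 2]
        history = tuple(prefix[:k + 1])          # the prefix ending in s
        strategy = alpha if owner[s] == 1 else beta
        prob *= strategy(history).get(a, 0.0) * delta[(s, a)].get(s_next, 0.0)
    return prob
```

Because the cones of two different prefixes of the same length are disjoint, the probability of a set of such prefixes is obtained by summing `cone_probability` over its elements.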

The winning modes sure, almost-sure, and positive are defined analogously to Section[2] where we restrict the play- 
ers to play an observation-based strategy. From the results of 117121 1 13 6 1 we obtain the following theorem summarizing 
the results for partial-observation games and POMDPs. 

Theorem 1 ( 17121113 .61). The following assertions hold: 

1. (One-sided games and POMDPs). The sure, almost-sure and positive winning problems for safety objectives; the sure and 
almost-sure winning problems for reachability objectives and Büchi objectives; the sure and positive winning problems for coBüchi 
objectives; and the sure winning problem for parity objectives are EXPTIME-complete for one-sided partial-observation 
games (Player 2 perfectly informed) and POMDPs. The positive winning problem for reachability objectives is 
PTIME-complete both for one-sided partial-observation games and POMDPs. 

2. (General partial-observation games). The sure and almost-sure winning problems for safety objectives and the sure winning problem for 
parity objectives are EXPTIME-complete for partial-observation games; the almost-sure winning problem for reachability 
objectives and Büchi objectives, and the positive winning problem for safety and coBüchi objectives, are 2EXPTIME-complete 
for partial-observation games. The positive winning problem for reachability objectives is EXPTIME-complete. 

3. (Undecidability results). The positive winning problem for Büchi objectives, the almost-sure winning problem for 
coBüchi objectives, and the positive and almost-sure winning problems for parity objectives are undecidable for 
POMDPs. 

4 Reduction: Games with Probabilistic Uncertainty to Partial-observation Games 

We now present a reduction of games with probabilistic uncertainty to classical partial-observation games. Let 
$G = (L, \Sigma_I, \Sigma_O, \Delta, \mathsf{un})$ be a game with probabilistic uncertainty. We construct a partial-observation game 
$H = ((L \times L) \cup (L \times L \times \Sigma_I),\ A_1 = \Sigma_I,\ A_2 = \Sigma_O,\ \delta = \delta_1 \cup \delta_2,\ \mathcal{O}_1,\ \mathcal{O}_2)$ as follows: 

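1. The transition function $\delta_1$ is deterministic, and for $(\ell_1, \ell_2) \in L \times L$ and $\sigma_I \in \Sigma_I$ we have 

$$\delta_1((\ell_1, \ell_2), \sigma_I) = (\ell_1, \ell_2, \sigma_I).$$

2. The transition function $\delta_2$ captures both $\Delta$ and $\mathsf{un}$ and is defined as follows: for $(\ell_1, \ell_2, \sigma_I) \in L \times L \times \Sigma_I$ and 
$\sigma_O \in \Sigma_O$ we have 

$$\delta_2((\ell_1, \ell_2, \sigma_I), \sigma_O)(\ell'_1, \ell'_2) = \Delta(\ell_1, \sigma_I, \sigma_O)(\ell'_1) \cdot \mathsf{un}(\ell'_1)(\ell'_2).$$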

Intuitively, the first component of the game H keeps track of the real state of the game G, and the second com- 
ponent keeps track of the information available from probabilistic uncertainty. Hence Player 1 is only allowed to 
observe the second component which is the probability distribution over the observable state given the current 
state. 

3. The observation mapping is as follows: we have $\mathcal{O}_1 = L$ and $\mathsf{obs}_1(\ell_1, \ell_2) = \mathsf{obs}_1(\ell_1, \ell_2, \sigma_I) = \ell_2$, i.e., only the 
second component is observable. We will consider two cases for $\mathcal{O}_2$: for the reduction of all-powerful strategies 
we will consider that Player 2 has complete observation, and in the other case we have $\mathcal{O}_2 = L$ and Player 2 observes 
the first component, which represents the correct history, i.e., $\mathsf{obs}_2(\ell_1, \ell_2) = \mathsf{obs}_2(\ell_1, \ell_2, \sigma_I) = \ell_1$. 

4. For a parity objective in $G$ given by a priority function $p_G : L \to \{0, 1, \ldots, d\}$, we consider the priority function 
$p_H$ in $H$ as follows: $p_H((\ell, \ell')) = p_H((\ell, \ell', \sigma_I)) = p_G(\ell)$, for all $\ell, \ell' \in L$ and $\sigma_I \in \Sigma_I$. 
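
The construction in items 1 to 3 can be sketched directly. The function name `build_partial_observation_game`, the dictionary encodings of the transition and uncertainty functions, and the choice to return plain tuples and dictionaries are assumptions made for the example; the second observation map corresponds to the case where Player 2 sees only the first component.

```python
from itertools import product


def build_partial_observation_game(locations, inputs, outputs, delta, un):
    """Build the state sets, transitions, and observation maps of H from G."""
    s1 = set(product(locations, locations))            # Player 1 states (l1, l2)
    s2 = set(product(locations, locations, inputs))    # Player 2 states (l1, l2, input)

    # Item 1: Player 1 moves deterministically, only recording the chosen input.
    delta1 = {((l1, l2), a): (l1, l2, a) for (l1, l2) in s1 for a in inputs}

    # Item 2: Player 2 moves combine the real transition with a fresh observation.
    delta2 = {}
    for (l1, l2, a) in s2:
        for b in outputs:
            row = {}
            for n1, p in delta[(l1, a, b)].items():
                for n2, q in un[n1].items():
                    if p * q > 0:
                        row[(n1, n2)] = p * q
            delta2[((l1, l2, a), b)] = row

    # Item 3: Player 1 sees the second component, Player 2 the first one.
    obs1 = {s: s[1] for s in s1 | s2}
    obs2 = {s: s[0] for s in s1 | s2}
    return s1, s2, delta1, delta2, obs1, obs2
```

Note that each row of `delta2` does not depend on the second component of the source state: the old observation is discarded and a new one is drawn from the uncertainty function at the new correct location.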

Correspondence of strategies. We will now establish the correspondence between strategies under probabilistic uncertainty in $G$ 
and observation-based strategies in $H$. We first introduce some notation. For simplicity of presentation, we will use a 
slight abuse of notation: given a history (or finite prefix) $\rho_H = s_0 a_0 s_1 a_1 s_2 a_2 \ldots s_{2n}$ in $H$ we will represent the history 
as $s_0 a_0 a_1 s_2 a_2 a_3 \ldots s_{2n}$, as each intermediate state is always uniquely defined by the preceding state and action. Intuitively, 
this removes the stuttering and does not affect parity objectives. 

Mapping of strategies from G to H. Given a history $\rho_H = s_0 a_0 a_1 s_2 a_2 a_3 \ldots s_{2n}$ in $H$, such that $s_{2i} = (\ell^1_{2i}, \ell^2_{2i})$, 
we consider two histories in $G$ as follows: 

$$g_1(\rho_H) = \ell^1_0 a_0 a_1 \ell^1_2 a_2 a_3 \ldots \ell^1_{2n}; \qquad g_2(\rho_H) = \ell^2_0 a_0 a_1 \ell^2_2 a_2 a_3 \ldots \ell^2_{2n}.$$

Intuitively, $g_1$ gives the first component (which is the correct history) and $g_2$ gives the second component (which is the 
observed history). We now define the mapping of strategies from $G$ to $H$: given a strategy $\alpha_G$ for Player 1, a strategy 
$\beta_G$ for Player 2, and an all-powerful strategy $\overline{\beta}_G$ for Player 2, in the game $G$, we define the corresponding strategies 
in $H$ as follows: for a history $\rho_H$ and an action $a_i$ for Player 1 we have 

$$\alpha_H(\rho_H) = \alpha_G(g_2(\rho_H));$$

$$\beta_H(\rho_H\, a_i) = \beta_G(g_1(\rho_H)\, a_i);$$

$$\overline{\beta}_H(\rho_H\, a_i) = \overline{\beta}_G(g_1(\rho_H), g_2(\rho_H), a_i).$$

Note that $\alpha_H$ and $\beta_H$ are observation-based strategies, and $\overline{\beta}_H$ is a strategy with complete observation, i.e., 
all-powerful strategies are mapped to complete-observation strategies. Hence for all-powerful strategies the reduction 
is to one-sided games. We will use $g$ to denote the mapping of strategies, i.e., $\alpha_H = g(\alpha_G)$, $\beta_H = g(\beta_G)$, and 
$\overline{\beta}_H = g(\overline{\beta}_G)$. 
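
The mapping of strategies is just precomposition with the two projections. In the sketch below, histories of $H$ are tuples in the abbreviated form, with Player 1 states encoded as pairs `(l1, l2)` and actions encoded as strings; this encoding (and hence the `isinstance` test that tells states from actions) and all function names are assumptions made for illustration.

```python
def project(history_h, component):
    """Keep the chosen component of every state pair; actions are left unchanged."""
    return tuple(x[component] if isinstance(x, tuple) else x for x in history_h)


def g1(history_h):
    """Correct history: first components."""
    return project(history_h, 0)


def g2(history_h):
    """Observed history: second components."""
    return project(history_h, 1)


def map_alpha(alpha_g):
    """Player 1 in H plays what alpha_G plays on the observed history."""
    return lambda history_h: alpha_g(g2(history_h))


def map_beta(beta_g):
    """Player 2 in H plays what beta_G plays on the correct history."""
    return lambda history_h, a_in: beta_g(g1(history_h), a_in)


def map_beta_all_powerful(beta_g):
    """An all-powerful Player 2 strategy may use both histories."""
    return lambda history_h, a_in: beta_g(g1(history_h), g2(history_h), a_in)
```

Since `map_alpha` only reads the second components and `map_beta` only the first, the resulting strategies cannot distinguish histories of $H$ that agree on the respective projection, which is the observation-based property noted above.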

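Mapping of strategies from H to G. We now present the mapping in the other direction. Let $\rho^1_G = \ell^1_0 \sigma^I_0 \sigma^O_0 \ell^1_1 \sigma^I_1 \sigma^O_1 \ldots \ell^1_n$ 
and $\rho^2_G = \ell^2_0 \sigma^I_0 \sigma^O_0 \ell^2_1 \sigma^I_1 \sigma^O_1 \ldots \ell^2_n$ be two prefixes in $G$. Intuitively, the first represents the correct history and the second 
the observed history. Then we consider the following sets of histories in $H$: 

$$h_1(\rho^1_G) = \{\rho_H \mid g_1(\rho_H) = \rho^1_G\}; \qquad h_2(\rho^2_G) = \{\rho_H \mid g_2(\rho_H) = \rho^2_G\};$$

and 

$$h_{1,2}(\rho^1_G, \rho^2_G) = \{\rho_H \mid g_1(\rho_H) = \rho^1_G \text{ and } g_2(\rho_H) = \rho^2_G\}.$$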

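We now define the mapping of strategies. Given an observation-based strategy $\alpha_H \in \mathcal{A}^O_H$ for Player 1, an observation-based 
strategy $\beta_H \in \mathcal{B}^O_H$ for Player 2, and a complete-observation strategy $\overline{\beta}_H \in \mathcal{B}_H$, we define the following 
strategies in $G$: for a correct history $\rho^1_G$, observed history $\rho^2_G$, and input $\sigma^I$ we have 

$$\beta_G(\rho^1_G\, \sigma^I) = \beta_H(\rho_H\, \sigma^I), \quad \rho_H \in h_1(\rho^1_G);$$

$$\alpha_G(\rho^2_G) = \alpha_H(\rho_H), \quad \rho_H \in h_2(\rho^2_G);$$

$$\overline{\beta}_G(\rho^1_G, \rho^2_G, \sigma^I) = \overline{\beta}_H(\rho_H\, \sigma^I), \quad \text{where } g_1(\rho_H) = \rho^1_G \text{ and } g_2(\rho_H) = \rho^2_G.$$

Note that since $\beta_H$ is observation-based it plays the same for all $\rho_H \in h_1(\rho^1_G)$, and similarly, since $\alpha_H$ is observation-based 
it plays the same for all $\rho_H \in h_2(\rho^2_G)$. Also observe that the strategy $\overline{\beta}_G$ is an all-powerful strategy. We will use 
$h$ to denote the mapping of strategies, i.e., $\alpha_G = h(\alpha_H)$, $\beta_G = h(\beta_H)$, and $\overline{\beta}_G = h(\overline{\beta}_H)$. 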

Given a starting state $\ell_0$ in $G$, consider the following probability distribution $\mu$ in $H$: $\mu(\ell_0, \ell) = \mathsf{un}(\ell_0)(\ell)$. Given 
the mapping of strategies, our goal is to establish the equivalence of the probability measures. We introduce some 
notation required to establish the equivalence. For $j \geq 0$, we denote by $(\tau^1_j, \tau^2_j)$ the pair of random variables for 
the $j$-th Player 1 state of the game $H$, and by $\theta^I_j$ and $\theta^O_j$ the random variables for the actions following the 
$j$-th state. Our first lemma relates the probability of observing the second component in $H$, given 
the first component, to the function $\mathsf{ObsSeq}$. We introduce notation for two events: given two prefixes 
$\rho^1_G = \ell^1_0 \sigma^I_0 \sigma^O_0 \ell^1_1 \sigma^I_1 \sigma^O_1 \ldots \ell^1_n$ and $\rho^2_G = \ell^2_0 \sigma^I_0 \sigma^O_0 \ell^2_1 \sigma^I_1 \sigma^O_1 \ldots \ell^2_n$ in $G$, let $\mathcal{E}_{1,2}(\rho^1_G, \rho^2_G)$ denote the event that for all 
$0 \leq j \leq n$ we have $\tau^1_j = \ell^1_j$, $\tau^2_j = \ell^2_j$ and for all $0 \leq j \leq n - 1$ we have $\theta^I_j = \sigma^I_j$, $\theta^O_j = \sigma^O_j$; and let $\mathcal{E}_1(\rho^1_G)$ denote the 
event that for all $0 \leq j \leq n$ we have $\tau^1_j = \ell^1_j$ and for all $0 \leq j \leq n - 1$ we have $\theta^I_j = \sigma^I_j$, $\theta^O_j = \sigma^O_j$. 

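Lemma 1. Let $\rho^1_G = \ell^1_0 \sigma^I_0 \sigma^O_0 \ell^1_1 \sigma^I_1 \sigma^O_1 \ldots \ell^1_n$ and $\rho^2_G = \ell^2_0 \sigma^I_0 \sigma^O_0 \ell^2_1 \sigma^I_1 \sigma^O_1 \ldots \ell^2_n$ be two prefixes in $G$. Then for all strategies 
$\alpha_H$ and $\beta_H$, the probability that the second component sequence in $H$ is $\rho^2_G$, given that the first component sequence 
is $\rho^1_G$, is $\mathsf{ObsSeq}(\rho^1_G)(\rho^2_G)$, i.e., formally 

$$\Pr^{\alpha_H,\beta_H}_{\mu}(\mathcal{E}_{1,2}(\rho^1_G, \rho^2_G) \mid \mathcal{E}_1(\rho^1_G)) = \mathsf{ObsSeq}(\rho^1_G)(\rho^2_G).$$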






Proof. The proof is by induction on the length of the prefixes. The base case is as follows: let the length of the prefixes
$\rho^1_G$ and $\rho^2_G$ be 1, with $\rho^1_G = \ell_0$ and $\rho^2_G = \ell$. Then we have

$$\mathrm{ObsSeq}(\ell_0)(\ell) = \mu(\ell_0, \ell);$$

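as required. We now consider the inductive case: we consider prefixes $\rho^1_G \sigma^I_n \sigma^O_n \ell^1_{n+1}$ and $\rho^2_G \sigma^I_n \sigma^O_n \ell^2_{n+1}$. Let us consider
the events $\mathcal{E}^{n+1}_{1,2} = \mathcal{E}_{1,2}(\rho^1_G \sigma^I_n \sigma^O_n \ell^1_{n+1}, \rho^2_G \sigma^I_n \sigma^O_n \ell^2_{n+1})$ and $\mathcal{E}^{n+1}_1 = \mathcal{E}_1(\rho^1_G \sigma^I_n \sigma^O_n \ell^1_{n+1})$. Let $\overline{\mathcal{E}}^{n+1}_{1,2}$ denote the event that
$\tau^1_n = \ell^1_n$, $\tau^2_n = \ell^2_n$, $\tau^1_{n+1} = \ell^1_{n+1}$, $\tau^2_{n+1} = \ell^2_{n+1}$, $\theta^I_n = \sigma^I_n$, and $\theta^O_n = \sigma^O_n$, and let $\overline{\mathcal{E}}^{n+1}_1$ denote the event that $\tau^1_n = \ell^1_n$,
$\tau^1_{n+1} = \ell^1_{n+1}$, $\theta^I_n = \sigma^I_n$, and $\theta^O_n = \sigma^O_n$. Then by definition we have

$$\Pr^{\alpha_H,\beta_H}_{\mu}\big(\overline{\mathcal{E}}^{n+1}_{1,2} \mid \overline{\mathcal{E}}^{n+1}_1\big) = \frac{\delta((\ell^1_n, \ell^2_n), \sigma^I_n, \sigma^O_n)(\ell^1_{n+1}, \ell^2_{n+1})}{\sum_{\ell'} \delta((\ell^1_n, \ell^2_n), \sigma^I_n, \sigma^O_n)(\ell^1_{n+1}, \ell')}$$

(In the numerator all choices are fixed, and in the denominator we sum over all possible choices of the second component)

$$= \frac{\Delta(\ell^1_n, \sigma^I_n, \sigma^O_n)(\ell^1_{n+1}) \cdot \mathrm{un}(\ell^1_{n+1})(\ell^2_{n+1})}{\sum_{\ell'} \Delta(\ell^1_n, \sigma^I_n, \sigma^O_n)(\ell^1_{n+1}) \cdot \mathrm{un}(\ell^1_{n+1})(\ell')}$$

$$= \mathrm{un}(\ell^1_{n+1})(\ell^2_{n+1}) \qquad \text{(Since } \textstyle\sum_{\ell'} \mathrm{un}(\ell^1_{n+1})(\ell') = 1\text{)}$$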

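Note that the crucial fact used in the above proof is in the second equality: for all $\ell^2_n$ we have

$$\delta((\ell^1_n, \ell^2_n), \sigma^I_n, \sigma^O_n)(\ell^1_{n+1}, \ell^2_{n+1}) = \Delta(\ell^1_n, \sigma^I_n, \sigma^O_n)(\ell^1_{n+1}) \cdot \mathrm{un}(\ell^1_{n+1})(\ell^2_{n+1})$$

(i.e., it is independent of $\ell^2_n$). Hence using the above equality and the inductive hypothesis we have:

$$\Pr^{\alpha_H,\beta_H}_{\mu}\big(\mathcal{E}^{n+1}_{1,2} \mid \mathcal{E}^{n+1}_1\big) = \Pr^{\alpha_H,\beta_H}_{\mu}\big(\mathcal{E}_{1,2}(\rho^1_G, \rho^2_G) \mid \mathcal{E}_1(\rho^1_G)\big) \cdot \Pr^{\alpha_H,\beta_H}_{\mu}\big(\overline{\mathcal{E}}^{n+1}_{1,2} \mid \overline{\mathcal{E}}^{n+1}_1\big)$$

$$= \mathrm{ObsSeq}(\rho^1_G)(\rho^2_G) \cdot \Pr^{\alpha_H,\beta_H}_{\mu}\big(\overline{\mathcal{E}}^{n+1}_{1,2} \mid \overline{\mathcal{E}}^{n+1}_1\big) \qquad \text{(By inductive hypothesis)}$$

$$= \mathrm{ObsSeq}(\rho^1_G)(\rho^2_G) \cdot \mathrm{un}(\ell^1_{n+1})(\ell^2_{n+1}) \qquad \text{(By previous equality)}$$

$$= \mathrm{ObsSeq}(\rho^1_G \sigma^I_n \sigma^O_n \ell^1_{n+1})(\rho^2_G \sigma^I_n \sigma^O_n \ell^2_{n+1}).$$

The desired result follows. □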
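The recurrence at the heart of this proof, where each step multiplies the running value by un of the correct state evaluated at the observed state, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of prefixes, the nested-dictionary encoding of un, the function name `obs_seq`, and the convention that prefixes with different actions get probability 0 are all choices made here, not notation taken from the paper.

```python
from fractions import Fraction
from itertools import product

def obs_seq(un, rho1, rho2):
    """Probability of observing prefix rho2 when the correct prefix is rho1.

    Prefixes are tuples (state, input, output, state, ..., state);
    un[l][l2] is the probability of observing l2 when the true state is l.
    """
    if len(rho1) != len(rho2):
        return Fraction(0)
    prob = Fraction(1)
    for i, (x, y) in enumerate(zip(rho1, rho2)):
        if i % 3 == 0:
            # State position: multiply by the uncertainty un(x)(y).
            prob *= un[x].get(y, Fraction(0))
        elif x != y:
            # Action position: assumed to agree in both prefixes.
            return Fraction(0)
    return prob

# Hypothetical two-state sensor that reports the true state with probability 3/4.
un = {
    "s": {"s": Fraction(3, 4), "t": Fraction(1, 4)},
    "t": {"s": Fraction(1, 4), "t": Fraction(3, 4)},
}
correct = ("s", "a", "b", "t")
observed = ("s", "a", "b", "s")
p = obs_seq(un, correct, observed)  # un(s)(s) * un(t)(s)

# Summing over every observed prefix with the same actions gives 1.
total = sum(obs_seq(un, correct, (x, "a", "b", y))
            for x, y in product("st", repeat=2))
```

Exact rationals are used so that the sum over all observed prefixes can be compared with 1 without rounding error.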

We will now establish the equivalences of the probabilities of the cones. 
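Lemma 2. For all finite prefixes $\rho_G$ in G, the following assertions hold:

1. For all strategies $\alpha_G$, $\beta_G$, $\beta^A_G$ (all-powerful), we have

$$\Pr^{\alpha_G,\beta_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) = \Pr^{g(\alpha_G),g(\beta_G)}_{\mu}(\mathrm{Cone}(h_1(\rho_G))); \qquad \Pr^{\alpha_G,\beta^A_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) = \Pr^{g(\alpha_G),g(\beta^A_G)}_{\mu}(\mathrm{Cone}(h_1(\rho_G))).$$

2. For all strategies $\alpha_H$, $\beta_H$, $\beta^C_H$ (complete-observation), we have

$$\Pr^{h(\alpha_H),h(\beta_H)}_{\ell_0}(\mathrm{Cone}(\rho_G)) = \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G))); \qquad \Pr^{h(\alpha_H),h(\beta^C_H)}_{\ell_0}(\mathrm{Cone}(\rho_G)) = \Pr^{\alpha_H,\beta^C_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G))).$$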

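Proof. We present the proof of the first item; the proof of the second item is identical. Let us denote
$\alpha_H = g(\alpha_G)$ and $\beta_H = g(\beta_G)$. We prove the result by induction on the length of the prefixes. The base case
is as follows: let the length of the prefix $\rho_G$ be 1, with $\rho_G = \ell_0$. We observe that $\Pr^{\alpha_G,\beta_G}_{\ell_0}(\mathrm{Cone}(\ell_0)) = 1$ and
$\Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\ell_0))) = 1$, and for all other cones of length 1 the probability is zero. This completes the base case.

We now consider the inductive case: by inductive hypothesis we assume that
$\Pr^{\alpha_G,\beta_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) = \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G)))$, and show that

$$\Pr^{\alpha_G,\beta_G}_{\ell_0}(\mathrm{Cone}(\rho_G a_n b_n \ell_{n+1})) = \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G a_n b_n \ell_{n+1}))).$$

Let $\ell_n$ be the last state of $\rho_G$. We first consider the left-hand side (LHS):

$$\Pr^{\alpha_G,\beta_G}_{\ell_0}(\mathrm{Cone}(\rho_G a_n b_n \ell_{n+1}))$$

$$= \Pr^{\alpha_G,\beta_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) \cdot \Big( \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \mathrm{ObsSeq}(\rho_G)(\rho') \cdot \alpha_G(\rho')(a_n) \cdot \beta_G(\rho_G a_n)(b_n) \cdot \Delta(\ell_n, a_n, b_n)(\ell_{n+1}) \Big)$$

$$= \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G))) \cdot \Big( \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \mathrm{ObsSeq}(\rho_G)(\rho') \cdot \alpha_G(\rho')(a_n) \cdot \beta_G(\rho_G a_n)(b_n) \cdot \Delta(\ell_n, a_n, b_n)(\ell_{n+1}) \Big)$$

$$= \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_{12}(\rho_G, \rho'))) \cdot \alpha_G(\rho')(a_n) \cdot \beta_G(\rho_G a_n)(b_n) \cdot \Delta(\ell_n, a_n, b_n)(\ell_{n+1}).$$

Above, the first equality is by definition, the second equality is by the inductive hypothesis, and the last equality is obtained
from Lemma 1 as follows: by Lemma 1 we have $\mathrm{ObsSeq}(\rho_G)(\rho') = \Pr^{\alpha_H,\beta_H}_{\mu}(\mathcal{E}_{1,2}(\rho_G, \rho') \mid \mathcal{E}_1(\rho_G))$, and hence

$$\Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G))) \cdot \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \mathrm{ObsSeq}(\rho_G)(\rho')$$

$$= \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G))) \cdot \Pr^{\alpha_H,\beta_H}_{\mu}(\mathcal{E}_{1,2}(\rho_G, \rho') \mid \mathcal{E}_1(\rho_G))$$

$$= \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_{12}(\rho_G, \rho'))).$$

We now consider the right-hand side (RHS) $\Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_1(\rho_G a_n b_n \ell_{n+1})))$. The RHS can be expanded as
follows, where for brevity we write $\rho = h_{12}(\rho_G, \rho')$ and denote by $\ell'_n$ the last state of $\rho'$:

$$\sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \sum_{\ell'_{n+1}} \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(\rho)) \cdot \alpha_H(\rho)(a_n) \cdot \beta_H(\rho a_n)(b_n) \cdot \delta((\ell_n, \ell'_n), a_n, b_n)(\ell_{n+1}, \ell'_{n+1}).$$

Since we have

$$\alpha_H(h_{12}(\rho_G, \rho'))(a_n) = \alpha_G(\rho')(a_n); \quad \text{and} \quad \beta_H(h_{12}(\rho_G, \rho') a_n)(b_n) = \beta_G(\rho_G a_n)(b_n),$$

the above expression for the RHS is equivalently described as:

$$\sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \sum_{\ell'_{n+1}} \Pr^{\alpha_H,\beta_H}_{\mu}(\mathrm{Cone}(h_{12}(\rho_G, \rho'))) \cdot \alpha_G(\rho')(a_n) \cdot \beta_G(\rho_G a_n)(b_n) \cdot \Delta(\ell_n, a_n, b_n)(\ell_{n+1}) \cdot \mathrm{un}(\ell_{n+1})(\ell'_{n+1}).$$

Since $\sum_{\ell'_{n+1}} \mathrm{un}(\ell_{n+1})(\ell'_{n+1}) = 1$, it follows that the LHS is equal to the RHS. The correspondence for the
all-powerful strategy $\beta^A_G$ follows from the same argument, replacing $\beta_G$ by $\beta^A_G$ appropriately. This completes
the proof and the desired result follows. □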

It follows that there is a sure, almost-sure, or positive winning strategy in G for $\mathrm{Parity}(p_G)$ iff there is a corresponding
one in H for $\mathrm{Parity}(p_H)$, and hence from Theorem 1 we obtain the following result.

Theorem 2. The following assertions hold: 

1. (All-powerful Player 2). The sure, almost-sure and positive winning for safety objectives; the sure and almost-sure
winning for reachability objectives and Büchi objectives; the sure and positive winning for coBüchi objectives;
and the sure winning for parity objectives can be solved in EXPTIME for games with probabilistic uncertainty
with all-powerful strategies for Player 2. The positive winning for reachability objectives can be solved in PTIME.

2. (Not all-powerful Player 2). The sure, almost-sure winning for safety objectives; and the sure winning for parity
objectives can be solved in EXPTIME; the almost-sure winning for reachability objectives and Büchi objectives;
the positive winning for safety and coBüchi objectives can be solved in 2EXPTIME for games with probabilistic
uncertainty without all-powerful strategies for Player 2. The positive winning for reachability objectives can be
solved in EXPTIME.



5 Reduction: POMDPs to Games with Probabilistic Uncertainty 

In this section we present a reduction in the reverse direction and show that POMDPs with parity objectives can be 
reduced to games with probabilistic uncertainty and parity objectives. We first present the reduction and then show 
the correctness of the reduction by mapping prefixes, strategies, and establishing the equivalence of the probability 
measure. 

Reduction: POMDPs to games with probabilistic uncertainty. Let $H = (S, A, \delta, O)$ be a POMDP with a parity
objective $\phi$; we construct the game with probabilistic uncertainty $G = (L, \Sigma_I, \Sigma_O, \Delta, \mathrm{un})$ as follows:

- $L = S$;

- $\Sigma_I = A$;

- $\Sigma_O = \{\bot\}$;

- For $\ell \in L$ and $a \in \Sigma_I$ let $\Delta(\ell, a, \bot)(\ell') = \delta(\ell, a)(\ell')$, i.e., the transition function is the same as the transition function
of the POMDP. In other words, the state space is the same, the action choices of the POMDP correspond to the
input action choices, the output action set is a singleton, and the transition function mimics the transition function
of the POMDP. Below we use the probabilistic uncertainty to capture the partial observation of the POMDP.

- The uncertainty function is as follows: $\mathrm{un}(\ell)(\ell') = 0$ if $\mathrm{obs}(\ell) \neq \mathrm{obs}(\ell')$, and $\mathrm{un}(\ell)(\ell') = \frac{1}{|\mathrm{obs}(\ell)|}$ otherwise,
where $|\mathrm{obs}(\ell)|$ is the number of states that have the same observation as $\ell$.

The parity objective is the same as the original parity objective. 
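The construction above is mechanical, so a minimal Python sketch may help make it concrete. The dictionary encodings of $\delta$ and obs, the function name `pomdp_to_game`, the string `"bot"` standing for the single output action, and the three-state example are all assumptions made for illustration; the uncertainty function is taken to be uniform over the states that share an observation, as in the definition above.

```python
from fractions import Fraction

BOT = "bot"  # stands for the single output action of the game

def pomdp_to_game(states, actions, delta, obs):
    """Build (L, Sigma_I, Sigma_O, Delta, un) from a POMDP.

    delta[(s, a)] maps successor states to probabilities;
    obs[s] is the observation label of state s.
    """
    # Transitions are copied, with the dummy output action added.
    Delta = {(s, a, BOT): dict(delta[(s, a)])
             for s in states for a in actions}
    un = {}
    for s in states:
        same = [t for t in states if obs[t] == obs[s]]
        # Uniform over states sharing the observation of s, zero elsewhere.
        un[s] = {t: (Fraction(1, len(same)) if t in same else Fraction(0))
                 for t in states}
    return list(states), list(actions), [BOT], Delta, un

# Hypothetical POMDP: s0 and s1 look alike, s2 is distinguishable.
states = ["s0", "s1", "s2"]
actions = ["a"]
obs = {"s0": "o", "s1": "o", "s2": "p"}
delta = {(s, "a"): {"s2": Fraction(1)} for s in states}
L, Sigma_I, Sigma_O, Delta, un = pomdp_to_game(states, actions, delta, obs)
```

In this example a state with observation `o` is reported as `s0` or `s1` with probability 1/2 each, so the controller learns exactly the observation and nothing more.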

Mapping of prefixes. Given a prefix (or a finite history) $\rho_H = s_0 a_0 s_1 a_1 s_2 \cdots s_n$ in H we construct a prefix in G as
$\rho_G = s_0 a_0 \bot s_1 a_1 \bot s_2 \cdots s_n$ by simply inserting the $\bot$ actions. This construction defines a bijection $h : \mathrm{Prefs}_H \to \mathrm{Prefs}_G$
between prefixes. We can naturally extend the mapping to sets of prefixes. Let $\Psi \subseteq \mathrm{Prefs}_H$, then
$h(\Psi) = \{h(\rho) \mid \rho \in \Psi\}$.
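As a small illustration of the bijection, the following Python sketch inserts and removes the dummy output action. The tuple encoding of prefixes and the names `h`, `h_inv`, and `BOT` are assumptions chosen for this example.

```python
BOT = "bot"  # placeholder for the single output action of the game

def h(prefix_h):
    """Insert BOT after every action of a POMDP prefix (s0, a0, s1, ..., sn)."""
    out = []
    for i, x in enumerate(prefix_h):
        out.append(x)
        if i % 2 == 1:  # odd positions hold actions
            out.append(BOT)
    return tuple(out)

def h_inv(prefix_g):
    """Drop the BOT entries of a game prefix (s0, a0, BOT, s1, ..., sn)."""
    return tuple(x for i, x in enumerate(prefix_g) if i % 3 != 2)

rho_h = ("s0", "a0", "s1", "a1", "s2")
rho_g = h(rho_h)
```

Applying `h_inv` to `rho_g` returns the original prefix, which is the sense in which the two sets of prefixes correspond one to one.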

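Lemma 3. For prefixes $\rho$, $\rho'$ in G the following assertion holds:

$$\mathrm{ObsSeq}(\rho)(\rho') = \prod_{i=1}^{n} \frac{1}{|o_i|} \quad \text{if } \mathrm{obs}(h^{-1}(\rho)) = \mathrm{obs}(h^{-1}(\rho')) = o_1 a_1 o_2 \ldots a_{n-1} o_n,$$

and $\mathrm{ObsSeq}(\rho)(\rho') = 0$ otherwise.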

Proof. We prove the result by induction on the length of prefixes. We only consider $\rho$ and $\rho'$ that have the same
length, as otherwise by definition the observation sequence probability is 0. We first consider the base case.

Base case. Let $\ell_0$ be the initial state. Then $\rho = \ell_0$ and let $\rho' = \ell$ for some $\ell \in L$. Then:

$$\mathrm{ObsSeq}(\ell_0)(\ell) = \mathrm{un}(\ell_0)(\ell) = \frac{1}{|\mathrm{obs}(\ell_0)|}$$

if $\ell_0$ and $\ell$ have the same observation, and 0 otherwise. This proves the base case.

Inductive step. We now consider prefixes of length n + 1, and by inductive hypothesis the result holds for prefixes of
length n. Then

$$\mathrm{ObsSeq}(\rho a_n \bot \ell_{n+1})(\rho' a_n \bot \ell'_{n+1}) = \mathrm{ObsSeq}(\rho)(\rho') \cdot \mathrm{un}(\ell_{n+1})(\ell'_{n+1}).$$

We now consider two cases to complete the proof.

- If $\mathrm{obs}(h^{-1}(\rho a_n \bot \ell_{n+1})) \neq \mathrm{obs}(h^{-1}(\rho' a_n \bot \ell'_{n+1}))$, then either $\mathrm{obs}(h^{-1}(\rho)) \neq \mathrm{obs}(h^{-1}(\rho'))$ or
$\mathrm{obs}(\ell_{n+1}) \neq \mathrm{obs}(\ell'_{n+1})$. It follows that one of the factors ($\mathrm{ObsSeq}(\rho)(\rho')$ or $\mathrm{un}(\ell_{n+1})(\ell'_{n+1})$) is equal to 0 and hence:

$$\mathrm{ObsSeq}(\rho a_n \bot \ell_{n+1})(\rho' a_n \bot \ell'_{n+1}) = 0.$$

- Otherwise, we have $\mathrm{obs}(h^{-1}(\rho a_n \bot \ell_{n+1})) = \mathrm{obs}(h^{-1}(\rho' a_n \bot \ell'_{n+1})) = o_1 a_1 o_2 \cdots a_{n-1} o_n a_n o_{n+1}$. Then:

$$\mathrm{ObsSeq}(\rho a_n \bot \ell_{n+1})(\rho' a_n \bot \ell'_{n+1}) = \mathrm{ObsSeq}(\rho)(\rho') \cdot \mathrm{un}(\ell_{n+1})(\ell'_{n+1}) = \Big( \prod_{i=1}^{n} \frac{1}{|o_i|} \Big) \cdot \frac{1}{|o_{n+1}|} = \prod_{i=1}^{n+1} \frac{1}{|o_i|}.$$

The desired result follows. □

Mapping of strategies. We first present the mapping of strategies from H to G and then from G to H. Note that in 
the game G, there is no choice for Player 2, and hence we remove the Player 2 strategies in the descriptions below. 

Mapping strategies from H to G. Let $\alpha_H$ be an observation-based Player-1 strategy in H and let
$\rho_G = s_0 a_0 \bot s_1 a_1 \bot s_2 \ldots s_n$ be a prefix in G. We define a Player-1 strategy $\alpha_G$ in G as follows: $\alpha_G(\rho_G) = \alpha_H(h^{-1}(\rho_G))$.

Mapping strategies from G to H. Let $\alpha_G$ be a Player-1 strategy in G and let $\rho_H = s_0 a_0 s_1 a_1 s_2 \ldots s_n$ be a prefix in H with
$o = o_0 a_0 o_1 a_1 o_2 \ldots o_n$ as its observation sequence. As Player 2 has only one strategy (always playing $\bot$),
we omit it from the discussion. Note that every $\rho' \in \mathrm{ActMt}(h(\rho_H))$ can have different actions with different probabilities
enabled. We define a Player-1 strategy $\alpha_H$ in H as follows: for an action $a \in A$ we have

$$\alpha_H(\rho_H)(a) = \sum_{\rho' \in \mathrm{ActMt}(h(\rho_H))} \mathrm{ObsSeq}(h(\rho_H))(\rho') \cdot \alpha_G(\rho')(a).$$

We now show that the strategy $\alpha_H$ is an observation-based strategy for Player 1 in the POMDP.

Lemma 4. The strategy $\alpha_H$ obtained from strategy $\alpha_G$ is an observation-based strategy for Player 1 in H.

Proof. Let $\rho_H$ and $\rho'_H$ be two prefixes in H that match in observation sequence; we need to argue that $\alpha_H$ plays
the same for both prefixes $\rho_H$ and $\rho'_H$. Observe that since $\rho_H$ and $\rho'_H$ have the same observation sequence, we have
$\mathrm{ActMt}(h(\rho_H)) = \mathrm{ActMt}(h(\rho'_H))$. Moreover, it follows from Lemma 3 that $\mathrm{ObsSeq}(h(\rho_H))$ only depends on the
observation sequence of $\rho_H$, and hence for all $\rho' \in \mathrm{ActMt}(h(\rho_H)) = \mathrm{ActMt}(h(\rho'_H))$ we have
$\mathrm{ObsSeq}(h(\rho_H))(\rho') = \mathrm{ObsSeq}(h(\rho'_H))(\rho')$. It follows that for all actions $a \in A$ we have $\alpha_H(\rho_H)(a) = \alpha_H(\rho'_H)(a)$, and
hence $\alpha_H$ is observation based. □
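The averaging that defines $\alpha_H$ from $\alpha_G$, and the observation-based property of Lemma 4, can be checked on a toy instance. The sketch below is illustrative only: it elides the $\bot$ actions (so a prefix is a tuple alternating states and actions), enumerates ActMt by brute force as all prefixes with the same actions, and uses a made-up uncertainty function and game strategy; none of these encodings or names come from the paper.

```python
from fractions import Fraction
from itertools import product

def obs_seq(un, rho, rho2):
    """Product of un over state positions; 0 if lengths or actions differ."""
    if len(rho) != len(rho2):
        return Fraction(0)
    prob = Fraction(1)
    for i, (x, y) in enumerate(zip(rho, rho2)):
        if i % 2 == 0:
            prob *= un[x][y]
        elif x != y:
            return Fraction(0)
    return prob

def act_mt(states, rho):
    """All prefixes with the same length and the same actions as rho."""
    acts = rho[1::2]
    for seq in product(states, repeat=len(acts) + 1):
        out = [seq[0]]
        for a, s in zip(acts, seq[1:]):
            out += [a, s]
        yield tuple(out)

def alpha_h(un, states, alpha_g, rho_h, a):
    """Sum over rho' of ObsSeq(rho_H)(rho') * alpha_G(rho')(a)."""
    return sum(obs_seq(un, rho_h, r) * alpha_g(r).get(a, Fraction(0))
               for r in act_mt(states, rho_h))

# Hypothetical instance: s0 and s1 share an observation, s2 has its own.
states = ["s0", "s1", "s2"]
half, zero, one = Fraction(1, 2), Fraction(0), Fraction(1)
un = {
    "s0": {"s0": half, "s1": half, "s2": zero},
    "s1": {"s0": half, "s1": half, "s2": zero},
    "s2": {"s0": zero, "s1": zero, "s2": one},
}

def alpha_g(rho):
    # An arbitrary game strategy that looks at the last observed state.
    if rho[-1] == "s0":
        return {"a": Fraction(1)}
    return {"a": Fraction(1, 3), "b": Fraction(2, 3)}

p0 = alpha_h(un, states, alpha_g, ("s0",), "a")
p1 = alpha_h(un, states, alpha_g, ("s1",), "a")
```

Although `alpha_g` distinguishes `s0` from `s1`, the averaged strategy assigns the same probability to action `a` after the POMDP prefixes `("s0",)` and `("s1",)`, which share an observation sequence.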

Correspondence of probabilities. In the following two lemmas we establish the correspondence of the probabilities 
for the mappings. 

Lemma 5. Let us consider the mapping of strategies from H to G. For all prefixes $\rho_H$ in H we have

$$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H))).$$

Proof. The proof is by induction on the length of the prefix $\rho_H$. We denote the last state of $\rho_H$ by $\ell_n$.

Base case. For prefixes of length 1 where $\rho_H = \ell_0$ we get $\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\ell_0)) = 1$ and $\Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\ell_0))) = 1$. For all
other prefixes both sides are equal to 0. Hence the base case follows.

Inductive step. By inductive hypothesis we assume the result for prefixes $\rho_H$ of length n (i.e., we assume that
$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H)))$) and will show that

$$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H a_n \ell_{n+1})) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H a_n \ell_{n+1}))).$$

First we expand the left-hand side (LHS) and by definition we get:

$$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H a_n \ell_{n+1})) = \Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) \cdot \alpha_H(\rho_H)(a_n) \cdot \delta(\ell_n, a_n)(\ell_{n+1}).$$

We now expand the right-hand side (RHS) and get:

$$\Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H a_n \ell_{n+1}))) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H))) \cdot \Big( \sum_{\rho' \in \mathrm{ActMt}(h(\rho_H))} \mathrm{ObsSeq}(h(\rho_H))(\rho') \cdot \alpha_G(\rho')(a_n) \cdot \Delta(\ell_n, a_n, \bot)(\ell_{n+1}) \Big).$$

Using the inductive hypothesis, the definition of the game, and the mapping of strategies we get on the RHS:

$$\Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H a_n \ell_{n+1}))) = \Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) \cdot \Big( \sum_{\rho' \in \mathrm{ActMt}(h(\rho_H))} \mathrm{ObsSeq}(h(\rho_H))(\rho') \cdot \alpha_H(h^{-1}(\rho'))(a_n) \cdot \delta(\ell_n, a_n)(\ell_{n+1}) \Big).$$

For all $\rho'$ that do not match the observation sequence of $h(\rho_H)$, we have $\mathrm{ObsSeq}(h(\rho_H))(\rho') = 0$ (by Lemma 3),
and as $\alpha_H$ is observation based, for all $\rho' \in \mathrm{ActMt}(h(\rho_H))$ that match the observation sequence of $h(\rho_H)$, the strategy
$\alpha_H$ plays the same. Let us denote by $\rho' \approx h(\rho_H)$ that $\rho'$ matches the observation sequence of $h(\rho_H)$. Then we have

$$\sum_{\rho' \in \mathrm{ActMt}(h(\rho_H))} \mathrm{ObsSeq}(h(\rho_H))(\rho') \cdot \alpha_H(h^{-1}(\rho'))(a_n)$$

$$= \sum_{\rho' \in \mathrm{ActMt}(h(\rho_H)),\ \rho' \approx h(\rho_H)} \mathrm{ObsSeq}(h(\rho_H))(\rho') \cdot \alpha_H(h^{-1}(\rho'))(a_n)$$

$$= \sum_{\rho' \in \mathrm{ActMt}(h(\rho_H)),\ \rho' \approx h(\rho_H)} \mathrm{ObsSeq}(h(\rho_H))(\rho') \cdot \alpha_H(\rho_H)(a_n)$$

$$= \alpha_H(\rho_H)(a_n);$$

where the first equality follows as for all sequences $\rho'$ that do not match the observation sequence of $h(\rho_H)$ we
have $\mathrm{ObsSeq}(h(\rho_H))(\rho') = 0$; the second equality follows as for all $\rho' \approx h(\rho_H)$ we have
$\alpha_H(h^{-1}(\rho'))(a_n) = \alpha_H(\rho_H)(a_n)$ (as $\alpha_H$ is observation based); and the last equality follows because ObsSeq is a
probability distribution, so $\sum_{\rho' \in \mathrm{ActMt}(h(\rho_H)),\ \rho' \approx h(\rho_H)} \mathrm{ObsSeq}(h(\rho_H))(\rho') = 1$. Hence we have

$$\Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(h(\rho_H a_n \ell_{n+1}))) = \Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) \cdot \alpha_H(\rho_H)(a_n) \cdot \delta(\ell_n, a_n)(\ell_{n+1}).$$

Thus the LHS and RHS coincide, and this completes the proof. □
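The equality of cone probabilities can also be checked numerically on a small example. The following Python sketch is a sanity check under stated assumptions, not part of the proof: the three-state POMDP, the observation-based strategy `alpha_h`, the fixed initial state `s0`, the brute-force enumeration of ActMt, and the elision of the $\bot$ actions are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

F = Fraction
states = ["s0", "s1", "s2"]
obs = {"s0": "o", "s1": "o", "s2": "p"}
delta = {
    ("s0", "a"): {"s1": F(1, 2), "s2": F(1, 2)},
    ("s0", "b"): {"s0": F(1)},
    ("s1", "a"): {"s2": F(1)},
    ("s1", "b"): {"s0": F(1, 3), "s1": F(2, 3)},
    ("s2", "a"): {"s2": F(1)},
    ("s2", "b"): {"s0": F(1)},
}

def un(s, t):
    """Uniform over the states that share the observation of s."""
    if obs[s] != obs[t]:
        return F(0)
    return F(1, sum(1 for u in states if obs[u] == obs[s]))

def alpha_h(rho):
    """Observation-based: depends only on the last observation."""
    if obs[rho[-1]] == "o":
        return {"a": F(1, 4), "b": F(3, 4)}
    return {"a": F(1)}

def cone_h(rho):
    """Cone probability in the POMDP, starting from s0."""
    if len(rho) == 1:
        return F(1) if rho == ("s0",) else F(0)
    prev, a, s = rho[:-2], rho[-2], rho[-1]
    step = alpha_h(prev).get(a, F(0)) * delta[(prev[-1], a)].get(s, F(0))
    return cone_h(prev) * step

def obs_seq(rho, rho2):
    """Product of un over state positions; 0 if the actions differ."""
    prob = F(1)
    for i, (x, y) in enumerate(zip(rho, rho2)):
        if i % 2 == 0:
            prob *= un(x, y)
        elif x != y:
            return F(0)
    return prob

def act_mt(rho):
    """All prefixes with the same actions as rho."""
    acts = rho[1::2]
    for seq in product(states, repeat=len(acts) + 1):
        out = [seq[0]]
        for a, s in zip(acts, seq[1:]):
            out += [a, s]
        yield tuple(out)

def cone_g(rho):
    """Cone probability in the game, with alpha_G(rho') = alpha_H(rho')."""
    if len(rho) == 1:
        return F(1) if rho == ("s0",) else F(0)
    prev, a, s = rho[:-2], rho[-2], rho[-1]
    avg = sum(obs_seq(prev, r) * alpha_h(r).get(a, F(0)) for r in act_mt(prev))
    return cone_g(prev) * avg * delta[(prev[-1], a)].get(s, F(0))

rho = ("s0", "b", "s0", "a", "s1")
```

For the prefix `rho`, both `cone_h(rho)` and `cone_g(rho)` evaluate to 3/32: the game averages `alpha_h` over all observed prefixes, but only those with the same observation sequence carry weight, and on those the strategy agrees.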

Lemma 6. Let us consider the mapping of strategies from G to H. For all prefixes $\rho_G$ in G we have

$$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(h^{-1}(\rho_G))) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G)).$$

Proof. The proof is by induction, and we denote the last state of $\rho_G$ by $\ell_n$. The base case is similar to the
base case of Lemma 5. We now present the inductive case.

Inductive step. By inductive hypothesis we assume the result for prefixes $\rho_G$ of length n (i.e., we assume that
$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(h^{-1}(\rho_G))) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G))$) and will show that

$$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(h^{-1}(\rho_G a_n \bot \ell_{n+1}))) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G a_n \bot \ell_{n+1})).$$

First we expand the right-hand side (RHS) and by definition we get:

$$\Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G a_n \bot \ell_{n+1})) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) \cdot \Big( \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \mathrm{ObsSeq}(\rho_G)(\rho') \cdot \alpha_G(\rho')(a_n) \cdot \Delta(\ell_n, a_n, \bot)(\ell_{n+1}) \Big).$$

As $\Delta(\ell_n, a_n, \bot)(\ell_{n+1})$ does not depend on $\rho'$ we get:

$$\Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G a_n \bot \ell_{n+1})) = \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) \cdot \Delta(\ell_n, a_n, \bot)(\ell_{n+1}) \cdot \Big( \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \mathrm{ObsSeq}(\rho_G)(\rho') \cdot \alpha_G(\rho')(a_n) \Big).$$

We will now show that the expansion of the left-hand side (LHS) gives the same expression. Let $\rho_H = h^{-1}(\rho_G)$.
By expanding the LHS we get:

$$\Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(h^{-1}(\rho_G a_n \bot \ell_{n+1}))) = \Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(h^{-1}(\rho_G))) \cdot \alpha_H(h^{-1}(\rho_G))(a_n) \cdot \delta(\ell_n, a_n)(\ell_{n+1})$$

$$= \Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) \cdot \alpha_H(\rho_H)(a_n) \cdot \delta(\ell_n, a_n)(\ell_{n+1})$$

$$= \Pr^{\alpha_H}_{\ell_0}(\mathrm{Cone}(\rho_H)) \cdot \alpha_H(\rho_H)(a_n) \cdot \Delta(\ell_n, a_n, \bot)(\ell_{n+1})$$

$$= \Pr^{\alpha_G}_{\ell_0}(\mathrm{Cone}(\rho_G)) \cdot \alpha_H(\rho_H)(a_n) \cdot \Delta(\ell_n, a_n, \bot)(\ell_{n+1});$$

where the first equality is by definition; the second equality is by rewriting $h^{-1}(\rho_G)$ as $\rho_H$; the third equality
is by the definition of $\Delta$ and $\delta$; and the final equality is the inductive hypothesis. By definition of $\alpha_H$ we have

$$\alpha_H(\rho_H)(a_n) = \sum_{\rho' \in \mathrm{ActMt}(\rho_G)} \mathrm{ObsSeq}(\rho_G)(\rho') \cdot \alpha_G(\rho')(a_n);$$

and hence it follows that the LHS and RHS coincide. Thus the desired result follows. □

The previous two lemmas establish the equivalence of the probability measures and complete the reduction of
POMDPs to games with probabilistic uncertainty. Hence the lower bounds for POMDPs also give lower bounds
for games with probabilistic uncertainty. Theorem 2, along with the reduction from POMDPs and Theorem 1,
gives us the following result for games with probabilistic uncertainty (the results are also summarized in Table 1).

Theorem 3. The following assertions hold: 

1. (All-powerful Player 2). The sure, almost-sure and positive winning for safety objectives; the sure and almost-sure
winning for reachability objectives and Büchi objectives; the sure and positive winning for coBüchi objectives; and
the sure winning for parity objectives are all EXPTIME-complete for games with probabilistic uncertainty with
all-powerful strategies for Player 2. The positive winning for reachability objectives is PTIME-complete.

2. (Not all-powerful Player 2). The sure, almost-sure winning for safety objectives; and the sure winning for parity
objectives are all EXPTIME-complete; the almost-sure winning for reachability objectives and Büchi objectives;
the positive winning for safety and coBüchi objectives can be solved in 2EXPTIME and is EXPTIME-hard for
games with probabilistic uncertainty without all-powerful strategies for Player 2. The positive winning for
reachability objectives can be solved in EXPTIME.

3. (Undecidability results). The positive winning problem for Büchi objectives, the almost-sure winning problem for
coBüchi objectives, and the positive and almost-sure winning problems for parity objectives are undecidable for
games with probabilistic uncertainty.





Objective     Sure                                Almost                              Positive
              All-powerful      Not-all-powerful  All-powerful      Not-all-powerful  All-powerful      Not-all-powerful
Safety        EXP-complete      EXP-complete      EXP-complete      EXP-complete      EXP-complete      2EXP, EXP
Reachability  EXP-complete      EXP-complete      EXP-complete      2EXP, EXP         PTIME-complete    EXP, PTIME
Büchi         EXP-complete      EXP-complete      EXP-complete      2EXP, EXP         Undec.            Undec.
coBüchi       EXP-complete      EXP-complete      Undec.            Undec.            EXP-complete      2EXP, EXP
Parity        EXP-complete      EXP-complete      Undec.            Undec.            Undec.            Undec.



Table 1. Complexity of games with probabilistic uncertainty with parity objectives, where for each entry we present
the upper and lower bound, or undecidability.



6 Conclusion



In this work we considered games with probabilistic uncertainty, a model that is natural for many problems and has not
been considered before. We presented a reduction of such games to classical partial-observation games and a reduction
of POMDPs to games with probabilistic uncertainty. As a consequence we establish the precise decidability frontier
for games with probabilistic uncertainty. Table 1 summarizes our results. For most problems we establish
EXPTIME-complete bounds. For some decidable problems we establish 2EXPTIME upper bounds and EXPTIME lower bounds;
establishing the precise complexity of these problems is an interesting open problem.